Uniformly bounded regret in the multi-secretary problem

Authors

  • Alessandro Arlotto
  • Itay Gurvich
Abstract

In the secretary problem of Cayley (1875) and Moser (1956), n non-negative, independent, random variables with common distribution are sequentially presented to a decision maker who decides when to stop and collect the most recent realization. The goal is to maximize the expected value of the collected element. In the k-choice variant, the decision maker is allowed to make k ≤ n selections to maximize the expected total value of the selected elements. Assuming that the values are drawn from a known distribution with finite support, we prove that the best regret—the expected gap between the optimal online policy and its offline counterpart in which all n values are made visible at time 0—is uniformly bounded in the number of candidates n and the budget k. Our proof is constructive: we develop an adaptive Budget-Ratio policy that achieves this performance. The policy selects or skips values depending on where the ratio of the residual budget to the remaining time stands relative to multiple thresholds that correspond to middle points of the distribution. We also prove that being adaptive is crucial: in general, the minimal regret among non-adaptive policies grows like the square root of n. The difference is the value of adaptiveness.
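The abstract's description of the policy — accept or skip a value depending on where the residual-budget-to-remaining-time ratio stands relative to distributional thresholds — can be illustrated with a minimal sketch. This is a simplified stand-in, not the authors' exact Budget-Ratio rule: here a value is accepted when its strict tail probability under the known distribution falls below the current budget ratio, and the example distribution (`values`, `probs`) is purely illustrative.

```python
# Illustrative known finite-support distribution (not from the paper).
values = [1, 2, 3]
probs = [0.5, 0.3, 0.2]

def tail_prob(x):
    """P(V > x) under the known distribution."""
    return sum(p for v, p in zip(values, probs) if v > x)

def budget_ratio_policy(stream, k):
    """Adaptive threshold sketch: accept a candidate value x when the
    tail probability P(V > x) is below the residual budget divided by
    the number of candidates still to come (including the current one).
    Intuitively, x is accepted when it lies in roughly the top b/m
    fraction of the distribution, where b is the remaining budget and
    m the remaining time."""
    n = len(stream)
    budget = k
    total = 0
    for t, x in enumerate(stream):
        if budget == 0:
            break
        remaining = n - t
        if tail_prob(x) < budget / remaining:
            total += x
            budget -= 1
    return total
```

Note how the rule adapts: as the budget is spent, the acceptance bar rises; as time runs short with budget left, the bar drops, which is the mechanism the abstract contrasts with fixed (non-adaptive) thresholds.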


Related articles

Bounded regret in stochastic multi-armed bandits

We study the stochastic multi-armed bandit problem when one knows the value μ(⋆) of an optimal arm, as well as a positive lower bound on the smallest positive gap ∆. We propose a new randomized policy that attains a regret uniformly bounded over time in this setting. We also prove several lower bounds, which show in particular that bounded regret is not possible if one only knows ∆, and bound...


A note on the Bayesian regret of Thompson Sampling with an arbitrary prior

We consider the stochastic multi-armed bandit problem with a prior distribution on the reward distributions. We show that for any prior distribution, the Thompson Sampling strategy achieves a Bayesian regret bounded from above by 14√(nK). This result is unimprovable in the sense that there exists a prior distribution such that any algorithm has a Bayesian regret bounded from below by (1/20)√(nK)....


How to Beat the Adaptive Multi-Armed Bandit

The multi-armed bandit is a concise model for the problem of iterated decision-making under uncertainty. In each round, a gambler must pull one of K arms of a slot machine, without any foreknowledge of their payouts, except that they are uniformly bounded. A standard objective is to minimize the gambler's regret, defined as the gambler's total payout minus the largest payout which would have bee...


Adaptive Distributed Consensus Control for a Class of Heterogeneous and Uncertain Nonlinear Multi-Agent Systems

This paper is devoted to the design of a distributed consensus control for a class of uncertain nonlinear multi-agent systems in the strict-feedback form. The communication between the agents is described by a directed graph. Radial-basis function neural networks are used for the approximation of the uncertain and heterogeneous dynamics of the followers as well as the effect o...


Beyond the Hazard Rate: More Perturbation Algorithms for Adversarial Multi-armed Bandits

Recent work on follow the perturbed leader (FTPL) algorithms for the adversarial multi-armed bandit problem has highlighted the role of the hazard rate of the distribution generating the perturbations. Assuming that the hazard rate is bounded, it is possible to provide regret analyses for a variety of FTPL algorithms for the multi-armed bandit problem. This paper pushes the inquiry into regret ...



Journal:
  • CoRR

Volume: abs/1710.07719  Issue: —

Pages: —

Publication date: 2017